65 research outputs found

    #Santiago is not #Chile, or is it? A Model to Normalize Social Media Impact

    Online social networks are known to be demographically biased. Currently there are questions about what degree of representativity of the physical population they have, and how population biases impact user-generated content. In this paper we focus on centralism, a problem affecting Chile. Assuming that local differences exist in a country, in terms of vocabulary, we built a methodology based on the vector space model to find distinctive content from different locations, and use it to create classifiers to predict whether the content of a micro-post is related to a particular location, having in mind a geographically diverse selection of micro-posts. We evaluate them in a case study where we analyze the virtual population of Chile that participated in the Twitter social network during an event of national relevance: the municipal (local governments) elections held in 2012. We observe that the participating virtual population is spatially representative of the physical population, implying that there is centralism in Twitter. Our classifiers out-perform a non geographically-diverse baseline at the regional level, and have the same accuracy at a provincial level. However, our approach makes assumptions that need to be tested in multi-thematic and more general datasets. We leave this for future work. Comment: Accepted in ChileCHI 2013, I Chilean Conference on Human-Computer Interaction.
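
    The vector space idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the TF-IDF weighting, the cosine-similarity classifier, the function names and the toy texts are all assumptions made for illustration.

```python
import math
from collections import Counter


def tfidf_vectors(docs_by_location):
    """Build one TF-IDF vector per location from its pooled micro-posts."""
    term_counts = {
        loc: Counter(text.lower().split())
        for loc, text in docs_by_location.items()
    }
    n_locations = len(term_counts)
    # Number of locations in which each term appears.
    doc_freq = Counter()
    for counts in term_counts.values():
        doc_freq.update(counts.keys())
    vectors = {}
    for loc, counts in term_counts.items():
        total = sum(counts.values())
        # Terms used everywhere get weight 0; location-specific terms score high.
        vectors[loc] = {
            term: (count / total) * math.log(n_locations / doc_freq[term])
            for term, count in counts.items()
        }
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def predict_location(post, vectors):
    """Assign a micro-post to the location whose vector it most resembles."""
    post_vector = Counter(post.lower().split())
    return max(vectors, key=lambda loc: cosine(post_vector, vectors[loc]))


# Hypothetical toy data, not taken from the paper's dataset.
docs = {
    "santiago": "metro providencia alameda metro elecciones",
    "valparaiso": "puerto cerro ascensor puerto elecciones",
}
vectors = tfidf_vectors(docs)
prediction = predict_location("voy en metro a votar", vectors)
```

    In this toy setup the shared word "elecciones" receives zero weight, so only location-specific vocabulary drives the prediction.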

    Informal and non-formal music experience: Power, knowledge and learning in music teacher education in Chile

    Previous research recognises the importance of musical experiences on music teacher education. However, current efforts do not provide a comprehensive view of the way their students learn music before starting university. The objective of this study is to portray their musical experiences, identifying the distinctive mechanisms underlying the relationship between practices, repertoires and training contexts for music learning. A combination of pedagogical, social and musical dimensions, inspired by the sociological theories of P. Bourdieu and B. Bernstein, is used to examine the pre-university musical experiences and the mediating role of students’ sociocultural origins. Empirically, multimodal information from four Chilean universities (n=55) was collected through the application of a survey questionnaire and semi-structured interviews, and analysed using a set of mixed techniques, including descriptive statistics, text mining and content analysis. Findings reveal relevant associations between practices, repertoires and learning contexts, especially in terms of the specialized nature of musical training and the habitus and cultural dispositions of practitioners. Particularly relevant is the predominance of informal and non-formal learning contexts and their translation into specific types of learning. These challenge current perspectives and contribute a tool kit for the understanding of the relationship between power and knowledge in future professional teachers.

    Combining strengths, emotions and polarities for boosting Twitter sentiment analysis

    Twitter sentiment analysis, or the task of automatically retrieving opinions from tweets, has received increasing interest from the web mining community. This is due to its importance in a wide range of fields such as business and politics. People express sentiments about specific topics or entities with different strengths and intensities, where these sentiments are strongly related to their personal feelings and emotions. A number of methods and lexical resources have been proposed to analyze sentiment from natural language texts, addressing different opinion dimensions. In this article, we propose an approach for boosting Twitter sentiment classification using different sentiment dimensions as meta-level features. We combine aspects such as opinion strength, emotion and polarity indicators, generated by existing sentiment analysis methods and resources. Our research shows that the combination of sentiment dimensions provides significant improvement in Twitter sentiment classification tasks such as polarity and subjectivity.
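
    As a rough illustration of what combining sentiment dimensions as meta-level features could look like, the sketch below builds one feature dictionary per tweet from three tiny lexicons. The lexicons, their scores and the feature names are hypothetical; the abstract does not say which resources or features the authors used.

```python
# Hypothetical toy lexicons standing in for existing sentiment resources.
POLARITY = {"good": 1, "great": 1, "bad": -1, "awful": -1}
STRENGTH = {"good": 2, "great": 4, "bad": -2, "awful": -4}
EMOTION = {"great": "joy", "awful": "anger"}


def meta_features(tweet):
    """Summarize polarity, strength and emotion signals as one feature dict."""
    tokens = tweet.lower().split()
    strengths = [STRENGTH[t] for t in tokens if t in STRENGTH]
    emotions = [EMOTION[t] for t in tokens if t in EMOTION]
    return {
        # Polarity dimension: counts of positive and negative words.
        "positive_words": sum(1 for t in tokens if POLARITY.get(t, 0) > 0),
        "negative_words": sum(1 for t in tokens if POLARITY.get(t, 0) < 0),
        # Strength dimension: the most intense score on each side.
        "max_positive_strength": max((s for s in strengths if s > 0), default=0),
        "max_negative_strength": min((s for s in strengths if s < 0), default=0),
        # Emotion dimension: counts per emotion category.
        "joy_words": emotions.count("joy"),
        "anger_words": emotions.count("anger"),
    }


features = meta_features("great phone but awful battery")
```

    A dictionary like this could then be passed to any supervised learner as the input representation for polarity or subjectivity classification.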

    Meta-level sentiment models for big social data analysis

    People react to events, topics and entities by expressing their personal opinions and emotions. These reactions can correspond to a wide range of intensities, from very mild to strong. An adequate processing and understanding of these expressions has been the subject of research in several fields, such as business and politics. In this context, Twitter sentiment analysis, which is the task of automatically identifying and extracting subjective information from tweets, has received increasing attention from the Web mining community. Twitter provides an extremely valuable insight into human opinions, as well as new challenging Big Data problems. These problems include the processing of massive volumes of streaming data, as well as the automatic identification of human expressiveness within short text messages. In that area, several methods and lexical resources have been proposed in order to extract sentiment indicators from natural language texts at both syntactic and semantic levels. These approaches address different dimensions of opinions, such as subjectivity, polarity, intensity and emotion. This article is the first study of how these resources, which are focused on different sentiment scopes, complement each other. With this purpose we identify scenarios in which some of these resources are more useful than others. Furthermore, we propose a novel approach for sentiment classification based on meta-level features. This supervised approach boosts existing sentiment classification of subjectivity and polarity detection on Twitter. Our results show that the combination of meta-level features provides significant improvements in performance. However, we observe that there are important differences that depend on the type of lexical resource, the dataset used to build the model, and the learning strategy. Experimental results indicate that manually generated lexicons are focused on emotional words, being very useful for polarity prediction. On the other hand, lexicons generated with automatic methods include neutral words, introducing noise in the detection of subjectivity. Our findings indicate that polarity and subjectivity prediction are different dimensions of the same problem, but they need to be addressed using different subspace features. Lexicon-based approaches are recommended for polarity, and stylistic part-of-speech based approaches are meaningful for subjectivity. With this research we offer a more global insight into the resource components for the complex task of classifying human emotion and opinion.

    Biosynthesis and characterization of a novel, biocompatible medium chain length polyhydroxyalkanoate by Pseudomonas mendocina CH50 using coconut oil as the carbon source

    This study validated the utilization of triacylglycerides (TAGs) by Pseudomonas mendocina CH50, a wild type strain, resulting in the production of novel mcl-PHAs with unique physical properties. A PHA yield of 58% dcw was obtained using 20 g/L of coconut oil. Chemical and structural characterisation confirmed that the mcl-PHA produced was a terpolymer comprising three different repeating monomer units, 3-hydroxyoctanoate, 3-hydroxydecanoate and 3-hydroxydodecanoate, or P(3HO-3HD-3HDD). Bearing in mind the potential of P(3HO-3HD-3HDD) in biomedical research, especially in neural tissue engineering, in vitro biocompatibility studies were carried out using NG108-15 (neuronal) cells. Cell viability data confirmed that P(3HO-3HD-3HDD) supported the attachment and proliferation of NG108-15 and was, therefore, confirmed to be biocompatible in nature and suitable for neural regeneration.

    Identification of 12 new susceptibility loci for different histotypes of epithelial ovarian cancer.

    To identify common alleles associated with different histotypes of epithelial ovarian cancer (EOC), we pooled data from multiple genome-wide genotyping projects totaling 25,509 EOC cases and 40,941 controls. We identified nine new susceptibility loci for different EOC histotypes: six for serous EOC histotypes (3q28, 4q32.3, 8q21.11, 10q24.33, 18q11.2 and 22q12.1), two for mucinous EOC (3q22.3 and 9q31.1) and one for endometrioid EOC (5q12.3). We then performed meta-analysis on the results for high-grade serous ovarian cancer with the results from analysis of 31,448 BRCA1 and BRCA2 mutation carriers, including 3,887 mutation carriers with EOC. This identified three additional susceptibility loci at 2q13, 8q24.1 and 12q24.31. Integrated analyses of genes and regulatory biofeatures at each locus predicted candidate susceptibility genes, including OBFC1, a new candidate susceptibility gene for low-grade and borderline serous EOC

    A Content and Structure Website Mining Model

    We present a novel model for validating and improving the content and structure organization of a website. This model studies the website as a graph and evaluates its interconnectivity in relation to the similarity of its documents. The aim of this model is to provide a simple way for improving the overall structure, contents and interconnectivity of a website. This model has been implemented as a prototype and applied to several websites, showing very interesting results. Our model is complementary to other methods of website personalization and improvement
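
    One possible reading of evaluating interconnectivity in relation to document similarity is sketched below. It is an assumption-laden illustration, not the model from the paper: links are treated as undirected, similarity is word-set Jaccard overlap, and the threshold, function names and toy pages are all hypothetical.

```python
from itertools import combinations


def jaccard(text_a, text_b):
    """Word-set overlap between two page texts, in the range 0..1."""
    words_a = set(text_a.lower().split())
    words_b = set(text_b.lower().split())
    union = words_a | words_b
    return len(words_a & words_b) / len(union) if union else 0.0


def link_suggestions(pages, links, threshold=0.5):
    """Compare the link graph with content similarity.

    Returns (missing, weak): similar page pairs that are not linked,
    and linked page pairs whose content is dissimilar.
    """
    linked = {frozenset(pair) for pair in links}
    missing, weak = [], []
    for a, b in combinations(sorted(pages), 2):
        similarity = jaccard(pages[a], pages[b])
        is_linked = frozenset((a, b)) in linked
        if similarity >= threshold and not is_linked:
            missing.append((a, b))
        elif similarity < threshold and is_linked:
            weak.append((a, b))
    return missing, weak


# Hypothetical three-page site.
pages = {
    "home": "welcome to our shop",
    "shoes": "running shoes for sale",
    "boots": "hiking boots for sale",
}
links = [("home", "shoes"), ("home", "boots")]
missing, weak = link_suggestions(pages, links, threshold=0.3)
```

    Here the two product pages share vocabulary but have no link between them, so they surface as a candidate for a new link, while the links from the home page connect pages with little textual overlap.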

    Website privacy preservation for query log publishing

    In this paper we study privacy preservation for the publication of search engine query logs. In particular, we introduce a new privacy concern, which is that of website privacy (or business privacy). We define the possible adversaries that could be interested in disclosing website information and the vulnerabilities found in the query log, from which they could benefit. In this work we also detail anonymization techniques to protect website information, and explore the different types of attacks that an adversary could use. We then present a graph-based heuristic to validate the effectiveness of our anonymization method, and perform an experimental evaluation of this approach. Our experimental results show that the query log can be appropriately anonymized against a specific attack for website exposure, by only removing approximately 9% of the total volume of queries and clicked URLs.